#reinforcement learning

Records found: 96

#reinforcement learning 19/10/2025

Weak-for-Strong: Training a 7B Meta-Agent to Orchestrate Powerful LLMs

'W4S trains a 7B meta-agent to program Python workflows that call stronger LLM executors, using offline RL to iteratively generate, execute, and refine solutions. The approach yields consistent gains across 11 benchmarks and achieves Pass@1 of 95.4 on HumanEval with GPT-4o-mini.'

#reinforcement learning 18/08/2025

How Pigeons Laid the Groundwork for Modern AI

'Midcentury pigeon experiments by B.F. Skinner inspired the associative learning ideas that underpin modern reinforcement learning, reshaping both AI and how scientists view animal intelligence.'